Post Summarization of Microblogs of Sporting Events
Abstract
Every day 645 million Twitter users generate approximately 58 million tweets. This raises the question of whether a summary of events can be generated from this rich set of tweets alone. Key challenges in post summarization from microblog posts include circumventing spam and conversational posts. In this study, we present a novel technique called lexi-temporal clustering (LTC), which identifies key events. LTC uses k-means clustering, and we explore three distance measures for clustering: Euclidean distance, cosine similarity and Manhattan distance. We collected three original data sets of Twitter microblog posts covering sporting events: one cricket match and two football matches. The match summaries generated by LTC were compared against standard summaries taken from the sports sections of various news outlets, which yielded up to 81% precision, 58% recall and 62% F-measure on different data sets. In addition, we report results for all three variants of the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) software, a tool that compares and scores automatically generated summaries against standard summaries.
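To make the role of the distance measure concrete, the following is a minimal sketch of k-means with a pluggable distance function. It is not the authors' LTC implementation: the abstract does not say how LTC builds its feature vectors, so the toy `points` (a hypothetical "minute of match, keyword count" pair per tweet window), the function names and the first-k initialisation are all assumptions made for illustration. Centroids are updated as plain means for every distance, which is a simplification for the Manhattan and cosine variants.

```python
import math


def euclidean(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; defined as 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def kmeans(points, k, distance, iterations=20):
    """Basic k-means; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: [minute of match, keyword count] per tweet window.
points = [[10, 1], [11, 2], [12, 1], [70, 5], [71, 6], [72, 5]]
labels_euclidean, _ = kmeans(points, 2, euclidean)
labels_manhattan, _ = kmeans(points, 2, manhattan)
```

On this toy data the Euclidean and Manhattan variants separate the early windows from the late ones. Cosine distance compares direction rather than magnitude, so it can group the same points differently, which is one reason to compare several measures.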
Similar resources
Multiple Post Microblog Summarization
The use of microblogs such as Twitter has increased incredibly over the past few years. Because of the public nature and sheer volume of text from these constantly changing microblogs, it is often difficult to fully understand what is being said about various topics. A method for summarizing popular topics of microblogs has been proposed but its summaries are only one sentence or phrase in leng...
Designing a model for the impact of innovation on the consequences of hosting sporting events
The aim of this study was to investigate the effect of innovation on the outcome of hosting Iranian sporting events. This research was of the developmental-applied type, and the statistical population included experts in sporting events. After confirming the validity of the researcher-made questionnaire by 20 experts, based on the final number of items in the questionnaire, 111 people were...
Evaluating Ranking Diversity and Summarization in Microblogs using Hashtags
Diversification techniques for web search have recently been developed that assume that, for each query, there is a set of underlying aspects or subtopics that address specific user intents. These techniques attempt to balance the relevance of the retrieved documents with the coverage of the aspects. Evaluation of diversification techniques requires some way of defining a set of aspects for eac...
A Model for Summarizing Celebrities with Microblogging Users' Interest
With the rapid development of Web 2.0, microblogging services such as Twitter are increasingly becoming an important source of up-to-date topics about what is happening in the world. In particular, microblogging has become one of the main ways for users to learn about celebrities. However, the huge volume of information makes it hard for people to get what they really want. How to filter out needless information thro...
Automatic Summarization of Real World Events Using Twitter
Microblogging sites, such as Twitter, have become increasingly popular in recent years for reporting details of real world events via the Web. Smartphone apps enable people to communicate with a global audience to express their opinion and commentate on ongoing situations often while geographically proximal to the event. Due to the heterogeneity and scale of the data and the fact that some mess...